DOCUMENT RESUME 

ED 291 771 TM 011 056



AUTHOR        Webber, Charles F.
TITLE         Program Evaluation: A Review and Synthesis.
PUB DATE      87
NOTE          28p.
PUB TYPE      Information Analyses (070)
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   *Educational Assessment; Ethics; Evaluation
              Utilization; Holistic Evaluation; Needs Assessment;
              *Program Evaluation; Qualitative Research;
              Statistical Analysis
IDENTIFIERS   Stakeholder Evaluation



ABSTRACT 

This paper reviews models of program evaluation. 
Major topics and issues found in the evaluation literature include 
quantitative versus qualitative approaches, identification and 
involvement of stakeholders, formulation of research questions, 
collection of data, analysis and interpretation of data, reporting of 
results, evaluation utilization, and ethical issues in program 
evaluation. There appears to be a trend toward the synthesis of 
program evaluation models. This focus incorporates concern for 
rigorous design, a combining of quantitative and qualitative 
methodologies, respect for the perspectives of all participants, 
pragmatism, active stakeholder involvement, and social value. The 
features of this focus proceed through the following steps: (1) study 
of program content; (2) establishment of stakeholder commitment and 
involvement; (3) focusing of the evaluation; (4) formulation of an 
evaluative design; (5) data collection; (6) holistic and statistical 
analyses; (7) interpretation of results; and (8) program 
modification. The process is circular and continuous. (TJH) 



************************************************************** 

* Reproductions supplied by EDRS are the best that can be made * 

* from the original document. * 
*********************************************************************** 



ERIC 




PROGRAM EVALUATION:
A REVIEW AND SYNTHESIS

by

Charles F. Webber, Ph.D.

1987



"PERMISSION TO REPRODUCE THIS 
MATERIAL HAS BEEN GRANTED BY 



TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC) " 



U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as received from the person or
organization originating it.

Minor changes have been made to improve reproduction quality.

Points of view or opinions stated in this document do not necessarily
represent official OERI position or policy.






Different models of program evaluation have been developed for a variety
of functions. Goal-oriented evaluation, for example, fits situations in
which participants wish to assess student progress and monitor effectiveness
of particular innovations. Decision-oriented evaluation focuses on informed
decision making. Transactional evaluation is organized around program processes
and the value positions held by key participants. Efforts to explain educational
effects and devise instructional strategies can be categorized as evaluation
research. Goal-free evaluation represents efforts to assess program effects
without being limited by a program's conceptual framework. Adversary evaluation
offers competing stakeholders access to the same program information (Cronbach,
1982; Morris & Fitz-Gibbon, 1978).

Program evaluations are also commissioned for more general reasons. 
Stakeholder uncertainty and confusion can be allayed. Decisions can be made 
in a more timely, systematic and informed manner. Evaluators can help clients
avoid planning based on what is fashionable. Understanding of educational 
diversity and complexity can evolve. Evaluations offer interest groups a 
common language, i.e. terms, data models, and orientations, that add coherence 
to discussions. Program evaluation helps an organization refocus when there 
is a danger of a program, per se, supplanting client need as a raison d'etre. 
Programs can be rendered more efficient, productive, and effective. Program 
credibility and acceptance can be augmented through the gathering of supportive 
evidence (Cronbach, 1982; Bennett & Lumsdaine, 1975; Sarason, 1982; Stake, 1967;
Weiss & Bucuvalas, 1977).

One justification offered for program evaluation is accountability.
This can be interpreted as teacher, student or program accountability.
Other constructs associated with accountability are productivity,
cost-effectiveness, quality control, and improved standards. These concepts can
be useful and relevant to policy shapers. However, accountability can imply
"that everybody, regardless of qualifications, has the right to pass judgement
on the teacher's performance" (Bowers, p. 138). More positively,
taken-for-granted beliefs underlying programs can be critically examined, new
issues can surface, thinking can be revised, and policies can be redefined.
Program participants can be encouraged to assume responsibility for their
own situations (Bowers, 1974; House, 1980; Sarason, 1982; Weiss & Bucuvalas, 1977).

Program contextual variance is nearly limitless. Site adaptation or
the "mutation phenomenon" (Berman & McLaughlin, 1974, p. 10) means that any
one evaluation will encounter a plethora of influences. Therefore, features
from a variety of models have been incorporated into the following discussion,
which has been influenced by the works of L.J. Cronbach, E.R. House, E.G. Guba,
Y.S. Lincoln, and M.Q. Patton.

Quantitative Versus Qualitative Evaluation Research 
The traditional view of evaluation research concentrates on results
and follows three steps:

"1. Two or more conditions are in place, at least one of them being
    the consequence of deliberate intervention.
 2. Persons or institutions are assigned to conditions in a way that
    creates equivalent groups.
 3. All participants are assessed on the same outcome measure(s)"
    (Cronbach, 1982, p. 24).

One example of this view is the systems analysis approach, which directs
results toward managers and economists. It assumes program goals are agreed
upon by all stakeholders, that cause-effect linkages can be established,
and that outcome variables can be quantified; efficiency is its aim (House,
1980). Patton (1978) compares traditional evaluation research to the natural
science format of hypothetico-deductive methodology. This dominant paradigm
characterizes good research as incorporating quantitative measurement,
experimental design, and multivariate, parametric statistical analysis.
These characteristics have been transposed from the basic statistical and
experimental techniques of agricultural experimentation (Parlett & Hamilton,
1976; Patton, 1978).

Evaluation research has moved away from the view that the only worthwhile 
study is one which yields reliable, quantifiable data (Cronbach, 1982; Cronbach, 
et. al., 1980; Guba & Lincoln, 1981; House, 1980; Patton, 1978, 1980).
Qualitative research has become more common and increasingly accepted.
Purportedly, evaluators now understand that people formulate plans, values,
and purposes which are influenced by emotions, cultures, and life experiences.

Evaluators with an anthropological bent believe modern, pluralistic
societies must be examined through in-depth, open-ended interviews and personal
observations which yield qualitative data that can be analyzed holistically.
They say this type of inquiry leads to understanding of particular situations
as opposed to predictive validity. Qualitative evaluators insist that site
variation negates the applicability of quantitative results to other settings.
Instead, evaluations should take advantage of local conditions and serendipity
in their efforts to reach understanding (Agar, 1980; Geertz, 1973; Guba &
Lincoln, 1981; House, 1980; Patton, 1978; Pelto & Pelto, 1978; Powdermaker,
1966).

Case studies exemplify the utility of qualitative studies because they
are politically sensitive and "more likely to be attended to than are typical
evaluation reports" (Guba & Lincoln, 1981, p. [illegible]). Case studies focus on
diverse program aspects and lead to community learning. [illegible] says
this makes the investigator a teacher who helps refine clients' perceptions
of a program, instead of trying to establish and maintain evaluator power
and expertise. Thus, the evaluator-teacher comes to value, just as ethnographers
do, subjective data garnered through observation and interpretation of participant
behaviors (Cronbach, 1982).

Although he highly values the case study approach, House (1980, p. 247)
cautions that it is "no panacea and entails a distinctive set of problems
of its own." Case study theory and methodology should be carefully examined
(House, 1980; Patton, 1978). With these cautionary notes in mind, Guba and
Lincoln (1981, p. 377) extol the freedom of case studies which allows situational
"vibes" to be picked up and explored. This, they feel, is not as possible
in tightly controlled studies which insist that everything be scientifically
documented; that is, the "case study provides a vehicle for the transference
of that kind of wordless knowledge" (Guba & Lincoln, 1981, p. 377).

Guba and Lincoln (1981) have three suggestions for qualitative evaluators.
First, case study records should be kept current and clearly show defensible
links between raw data and conclusions. Second, interviews should be carefully
documented to show they were reliably and validly conducted. Third, instead of
apologizing for the subjective nature of case studies, evaluators should list
the advantages inherent in the subjectivity:

- Questions can be restated if not initially understood.

- Interviews are personal.

- Sensitive topics can be dealt with compassionately.

- The affective responses of informants can be noted.

- Nonverbal clues can be studied.

- Rich contextual information can be gathered.

There are similarities between the experimental and naturalistic evaluation
camps. While experimental evaluators give more credence to quantified data
than the naturalistic evaluators would, they both believe that society should
attempt to progress. Further, they both acknowledge that change can be well
intentioned but harmful. Technical similarities include a shared concern for
sampling, question formulation, and quality. Evaluations should cross freely
between the two categories because one is not better than the other; choice
should be made on the basis of suitability to the program under study (Cronbach,
1982; Cronbach, et. al., 1980; Bennett & Lumsdaine, 1975; Kuhn, 1970; Patton, 1980).
As Patton (1978, p. 235) says, "There is no single factor or set of factors
that can solve the mystery of human behavior, no one answer to the fundamental
philosophical question: why do people do what they do? (Nor is there a
single answer to that most fundamental of governmental questions: how do
we get people to do what we want them to do?)"

There have been calls for the use of multiple data gathering methods
(Campbell & Fiske, 1959; Cronbach, 1982; Cronbach, et. al., 1980; Denzin, 1978;
Guba & Lincoln, 1981; Patton, 1980). Triangulation, as this is called, allows
different aspects of a program to surface. Cronbach, et. al. (1980, p. 222)
state, "Those who advocate an evaluation plan devoid of one kind of information
or the other carry the burden of justifying such exclusion."

Identifying and Involving Stakeholders 
Significant program studies will produce results that will increase
or decrease the power of individual interest groups. Evaluation researchers
can anticipate that individuals whose power is increased will support and
defend the study and vice versa (Guba & Lincoln, 1981). Evaluation should
be undertaken only when the political system indicates it will give serious
consideration to results generated by the study. The political system may
include groups as diverse as voters, managers, operating personnel, and policy
makers, which means considerable effort must be exerted by the evaluator to
ascertain levels of audience receptivity (Cronbach, 1982; Guba & Lincoln, 1981).

Cronbach (1982) says the degree to which an evaluation is successful 
is the extent to which the interest groups are able to resolve conflicts 
intelligently. This may seem contrary to Guba and Lincoln's (1981, p. 299)
statement that, "Evaluation is always disruptive of the prevailing political
balance." However, this becomes clear when the level of political dissent
is seen as a question of degree. In other words, a successful program evaluation
will reduce the level of stakeholder disharmony over a particular issue,
though it cannot expect to satisfy everyone involved.

If an evaluation is to reduce disharmony, then the evaluator should 
involve stakeholders in the identification of contentious issues (Cronbach, 
1982; Guba & Lincoln, 1981; Patton, 1978). Further, the evaluator should try
to become well acquainted with stakeholders' beliefs and values because
"different people have different appetites for different information" (Stake, 
1973, p. 304). 

Guba and Lincoln (1981, p. 308) suggest three things to consider during
interest group identification:

"1. Who are the presumed direct beneficiaries of the evaluand?
 2. Who are the indirect beneficiaries?
 3. What groups might, as a result of the evaluation, be persuaded to
    adopt or adapt the evaluand in their own settings?"

Considering negative effects will also help identify stakeholders.

Failure to involve audiences at the beginning of an evaluation may
automatically cause critical questions to be overlooked, study results to be
suspect, and methods to be criticized. It may also be unfair to audience
members because, "The act of evaluation provides a political legitimation
difficult to achieve in other ways" (Guba & Lincoln, 1981, p. 306). Exclusion
of crucial audiences may misapply that legitimacy. However, some audiences
may have motives for trying to derail an evaluation.

Audience involvement means persons with the power to facilitate or hinder
entree (the stage where stakeholder commitment is sought) need to be committed
to the study. These strategically located individuals are called "gatekeepers" 
(Guba & Lincoln, 1981, p. 290). Each gatekeeper will require an explanation 
of the evaluation before the research can effectively continue. 

Gatekeepers do not have to be people of formal position and authority. 
They should, however, be enthusiastic, committed, competent, interested, and
aggressive. Patton (1978, p. 71) suggests that "more may be accomplished
by working with a lower level person ... than in working with a passive,
disinterested person in a higher position." Failure to work with gatekeepers
can mean an evaluation is really not targeted at all, resulting in reduced 
utilization potential. 

Involving gatekeepers who have a genuine interest in research data is
called considering the "personal factor" (Patton, 1978, p. 70; House, 1980,
p. 64). Recognizing the personal factor shows an understanding of how decision
making is a personal and political process, rather than strictly a scientific
and rational process.

Varied audience information needs imply multiple criteria; multiple
methods of data collection; different styles of data analysis, interpretation,
and reporting; and varied satisfaction with study conclusions (House, 1980).
This wide range of interests and needs stresses the importance of formulating
a contract with evaluation audiences (Cronbach, 1982; Guba & Lincoln, 1981).
Minimally, contracts should address the following issues, regardless of contract
complexity: "identification of the sponsor or client, identification of the
entity to be evaluated, purpose of the evaluation, sanction (from relevant
parties), audiences, methods of inquiry, emergent design, access to records,
confidentiality and anonymity, evaluator autonomy, reporting, and technical
specifications" (Guba & Lincoln, 1981, pp. 271-282).

Formulating Research Questions 
Scriven (1967) delineated two types of evaluation - formative and summative.
Formative evaluation tries to collect information for program development and
improvement and summative leads to more final judgments about program
effectiveness. Patton (1978) suggests Scriven's distinction between formative and
summative evaluation may be artificial. Instead, all evaluations can be
viewed as formative. That is, evaluation of a program's outcomes can and should
be used formatively by asking why the program was effective, thus assisting
others who may be considering implementing or improving the same or a similar
program. Nevertheless, the formative-summative distinction is widely accepted
as depicted in the figure below, developed at the Center for the Study of
Evaluation (CSE), University of California, Los Angeles:



    1               2               3               4
    Needs           Program         Formative       Summative
    Assessment      Planning        Evaluation      Evaluation

Stages of the CSE Evaluation Model (Morris & Fitz-Gibbon, 1978, p. [illegible])




The formative-summative distinction is important when evaluators formulate
evaluation questions. To do this they must consider evaluation purposes,
information uses, types and amounts of evaluation data likely to be generated,
and alternative actions that will be open to decision makers once the evaluation
is complete. This stage is followed by one in which evaluators decide whether
appropriate data can be gathered to answer a question, the degree to which
the research question predetermines or suggests answers, how badly decision
makers want or need the answers for their own or others' decisions, and how
relevant the question is for future action (Cronbach, 1982; Patton, 1978).

Program goals can be a source of questions, especially during early 
phases of evaluations. However, program goals may be politically decorative, 
not necessarily reflecting real goals nor hinting at unwanted side effects
(Cronbach, 1982; Cronbach, et. al., 1980; Schultze, 1968). If goals are used
as a source of questions, then evaluators would do well to look for goals 
that are unrealistic, particularly well met, or ones program staff did not 
even try to reach. Cronbach (1982) also warns against setting quantitative 
expectations for goal attainment. He suggests satisfactory levels should 
be negotiated after the assessment, when quantitative data can be seen in 
light of the qualitative data also garnered. 

In the initial stage of formulating evaluation questions, many issues
will be highlighted. When investigators begin to plan the kinds of observations
they need to make, they usually see that far more should be done than can be
done. Patton (1978, pp. 80-81) claims, in reference to university based
investigations, that, "Professors have trouble getting graduate students
to analyze less than the whole of human experience in their dissertations."
This general sentiment is echoed by Cronbach, et. al. (1980) who suggest
that one evaluation, or even a series of evaluations, will not end an argument
about a program. Instead of trying to do too much, program evaluators should
remember that relevance is important and limit their investigations accordingly.

Those who feel they would like to investigate a large number of issues
should consider several things. First, even though efforts will probably
be made to give equal attention to all questions, some will necessarily end
up getting more. Thus, focusing will occur and better that it be guided
than haphazard. Second, investigators, especially neophytes, may become
overwhelmed at the task they have assigned themselves and become unable to
do a good job on any part. Third, small-scale evaluations can more quickly
and easily depict the usefulness of program evaluation; future studies can
then expect more support than if large-scale ones had failed. Finally, the
audience will not need nor want to know absolutely everything about a program.
Evaluators should be cognizant of audience attention span (Borg & Gall, 1979;
Cronbach, 1982; Patton, 1978).

One danger of attempting too much is that by the time the study is done
the situation may have changed and the results be of little more than passing
interest. Ideally, a study will provide accurate and perceptive information
to decision makers when it will be most useful (Cronbach, 1982; Marris & Rein,
1973).

The influence study data have on subsequent decisions is called leverage
(Cronbach, 1982; Cronbach, et. al., 1980; Patton, 1978). Leverage is critical
because rigorous methodology, sophisticated statistical analyses, and large
samples are worthless if they yield useless information. Determining leverage
involves considering the politics of the issue and the import of the decisions
affected by the question. Programs sometimes have so much political backing
that even if evidence of their ineffectiveness is uncovered it would have
little leverage. Evaluators should think carefully about expending energy on
investigations of programs that are firmly ensconced in the political milieu
([illegible], 1973).

Cronbach (1982) offers a type of priority scale for deploying investigative
effort. His concepts have been transformed into this figure:

                      high leverage              low leverage

  high prior          A. essential               B. limited resource
  uncertainty            to include                 investment

  low prior           C. only low cost           D. investigate only
  uncertainty            information should         incidentally
                         be gathered

Priority scale for evaluation questions



Issues which fall into block A must be addressed if the evaluation's
credibility is to be maintained. Block B issues should not be allocated a
high resource expenditure unless initial investigations yield data so compelling
that the issue gets pushed into block A. The same can be said for block C.
Category D questions include those which would yield information that would
be interesting but not useful for decision making.
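
The four blocks amount to a two-way lookup on leverage and prior uncertainty.
As a minimal sketch in Python, assuming each question can be rated simply as
high or low on both dimensions (the names `Priority` and `classify_question`
and the sample questions are hypothetical illustrations, not Cronbach's own
terms), the scale might be expressed as follows:

```python
from enum import Enum


class Priority(Enum):
    """Hypothetical labels for blocks A-D of the priority scale."""
    A = "essential to include"
    B = "limited resource investment"
    C = "only low cost information should be gathered"
    D = "investigate only incidentally"


def classify_question(high_leverage: bool, high_prior_uncertainty: bool) -> Priority:
    """Map a question's two ratings onto one block of the scale.

    Assumption: both ratings are simple high/low judgments made by the
    evaluator together with stakeholders.
    """
    if high_prior_uncertainty:
        return Priority.A if high_leverage else Priority.B
    return Priority.C if high_leverage else Priority.D


# Illustrative questions only: (text, high_leverage, high_prior_uncertainty)
questions = [
    ("Did reading scores change?", True, True),
    ("Do staff like the new schedule?", False, True),
    ("Is attendance still being recorded?", True, False),
    ("What colour are the classroom walls?", False, False),
]

# Order the questions from block A to block D before allocating resources.
ranked = sorted(questions, key=lambda q: classify_question(q[1], q[2]).name)
for text, leverage, uncertainty in ranked:
    block = classify_question(leverage, uncertainty)
    print(f"{block.name}: {text} ({block.value})")
```

Because the ratings are judgments that may change as data come in, a question
can be reclassified simply by calling the function again with revised ratings,
which mirrors the way a block B or C issue can be pushed into block A.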

Flexibility should be built into an evaluation plan so that, as events 
unfold, new variables can be identified as politically salient and worthy 
of inclusion. Initially, it should be presumed that variables which participants
say are important do have leverage (Cronbach, et. al., 1980).

No matter how carefully potential objections are envisioned and countered,
unpalatable findings will be attacked from some perspective. Designing a
program evaluation is a political process and if one of the decision makers
does not want a question answered, he will find ways to denigrate the answers
which evolve. This underlines the importance of involving decision makers
in the formulation of research questions and of keeping the design as open
as possible, even though such tentativeness may be uncomfortable for those
used to rigid hypothesis-testing research. When a research question is
challenged by part of the policy shaping community, the evaluator must
decide if the political imbalance and human uncertainty likely to result
if the question is pursued are worth provoking (Cronbach, et. al., 1980;
Guba & Lincoln, 1981; Patton, 1978).

Evaluators cannot answer all questions and reduce all uncertainty. All 
an evaluator can hope to achieve is to shed some new light, add some additional
information, and increase certainty in some small way (Cronbach, 1982; Cronbach,
et. al., 1980).



Gathering Data 

There appears to be no single best plan for a particular study even when 
factors of time and budget are considered. Thus, formulating study designs
can be viewed as learning by both evaluators and stakeholders. This is particularly
encouraging for novice evaluators who can utilize their relative freedom of
movement among all segments of the policy shaping community to generate a
picture of the situation more complete than what other participants could
generate (Cronbach, 1982; Cronbach, et. al., 1980).

Involving stakeholders in the planning of data gathering will encourage
stakeholder commitment. It will also be possible to anticipate satisfaction
and dissatisfaction with potential findings and to incorporate appropriate
strategies. Investigators should guard against being reduced to technicians
by stakeholders, merely applying sampling, measurement, and statistical analysis
skills (Cronbach, 1982; Guba & Lincoln, 1981; Patton, 1978).




Evaluation researchers must also choose between fidelity (answer reliability)
and bandwidth (the number of questions for which answers are offered).
While focusing resources on one issue improves fidelity, it is more often
appropriate to strike a balance with bandwidth. The degree of balance will
vary with each situation and will be determined cooperatively by the evaluator
and the policy shaping community (Cronbach, 1982).

Audiences need to be aware of influences wielded by a program's political
environment, decision maker uncertainty, and information availability. They
can then recognize that evaluation findings can be useful in particular settings
but need to be viewed tentatively when generalized to other settings. They
will realize that the actual study is substituting for the ideal study that 
would supply irrefutable data (Cronbach, 1982; Patton, 1978). 

Because there are always more methods of inquiry possible than resources
available, the range of choices must be reduced. This narrowing will be helped
by developing a thorough understanding of the setting: community characteristics,
organizational characteristics, staff peculiarities, etc. Further, the purposes
of the study need to be re-examined as does the proposed timeline. These
considerations will strongly influence the number of variables examined, plus
the size and nature of the sample(s) to be studied (Cronbach, 1982; Patton,
1978).

Patton (1978) and Cronbach (1982) say studies should be sensitive to
local conditions, not mechanically objective. The understanding they seek
comes from being close to the situation, i.e. subjective, resulting in a
personalized evaluation, more legitimate in the eyes of program participants.
This can help identify interesting and important program features that would
not have been initially noted (Cronbach, 1982; Cook, 1981; Berryman & Glennan,
1980). Cronbach (1982) goes so far as to say that it would be unnatural for
programs or experimental treatments to be implemented identically in different
sites. Examining a program qualitatively means the influences of participant
biases and experiences can be noted, as can degrees and nuances of variation
from site to site. This is the "ecological correlation" (Cronbach, 1982,
p. 99) between a setting and a program.

Focusing on site variation and adaptation of a program tends to make
a study less able to be replicated than an experimental design. Cronbach
(1982, p. 293) advises the evaluator to "sample those strata that he thinks
will predominate in the future," which is not a strategy used in formal experimental
design, although it could be part of a quasi-experimental one. Patton (1980,
p. 101) agrees and suggests it is possible for "decision makers and evaluators
(to) think through what cases they could learn the most from, and those are the
cases that should be selected for study." He goes on to say that the least 
desirable type of sampling is based on convenience. 

Attempting to understand program diversity in different settings means, 
according to Cronbach (1982), evaluation plans and operations cannot be 
rigidly fixed. However, doing this means results cannot be generalized beyond 
the group or situation under scrutiny. To avoid this restriction, an investigator 
can employ random sampling so the data collected will more likely be representative 
of the larger population. Sample size would depend on the amount one would 
wish to generalize beyond X and the amount of error acceptable (Cronbach, 1982;
Patton, 1980). It is important to remember that the virtues of strong designs
should not prompt anyone to think of them as the only worthwhile design
(Cronbach, et. al., 1980).
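
The dependence of sample size on acceptable error can be made concrete. The
sketch below is not taken from Cronbach or Patton. It assumes simple random
sampling and an outcome expressed as a proportion, and it uses the standard
textbook formula n = z^2 * p * (1 - p) / e^2 with an optional finite population
correction. The function name and the default values (a 95 percent confidence
level and the most conservative proportion of 0.5) are illustrative assumptions.

```python
import math


def sample_size_for_proportion(margin_of_error, z=1.96, p=0.5, population=None):
    """Approximate sample size needed to estimate a proportion.

    margin_of_error: acceptable error, e.g. 0.05 for plus or minus 5 points.
    z: normal critical value (1.96 corresponds to about 95% confidence).
    p: expected proportion; 0.5 gives the largest, most cautious sample.
    population: size of the group being generalized to, if it is known and
        small enough that a finite population correction matters.
    """
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None:
        # Finite population correction shrinks the requirement for small groups.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# Tighter acceptable error demands a larger sample.
print(sample_size_for_proportion(0.05))
print(sample_size_for_proportion(0.03))
# Generalizing only to a district of 1,000 students needs fewer respondents.
print(sample_size_for_proportion(0.05, population=1000))
```

A calculation of this kind addresses only statistical error. It says nothing
about whether the sampled sites resemble the settings to which decision makers
hope to generalize, which is the concern raised in the preceding paragraphs.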

Analysis, Interpretation, and Reporting of Data
All stakeholders should continue to have input throughout the latter
evaluation stages. Fairness requires that evaluators, who have consulted with
audiences earlier in the process, not ignore stakeholders until the final
report. Continuous evaluation feedback will constitute more of a learning
process for evaluation participants. Supplying information to only a portion
of the policy shaping community provides that portion with power in the form
of knowledge control. Evaluators should strive to advise the entire audience
by seeing that information is thoroughly disseminated and explained. Investigators
cannot hope to remove all doubt or confusion surrounding a program by writing
a summary report. Issues are generally too complex for this to happen.
Unless audiences are kept abreast of the findings as they are uncovered, they
may not accept a final report. Evaluation surprises of this sort may in fact
increase uncertainty rather than reduce it (Cronbach, 1982; Cronbach, et. al.,
1980; Guba & Lincoln, 1981; Patton, 1978).

Guba and Lincoln (1981) say that the complexity of data reporting may
vary from stakeholder to stakeholder, particularly when audience sophistication
varies. Efforts should be made to inform and explain so partisans do not
interpret findings incorrectly and unwisely. Some portions of a program may
be doing well and should be praised before shortcomings are identified.
Evaluators should consider audience attention span and limit analyses and
data sets that probably will not affect decision making, while still preserving
a display of evidence and reasoning. Also, having interest groups help in
data interpretation will help air differences, misconceptions, and uncertainties
(Cronbach, 1982; Cronbach, et. al., 1980).

Evaluators operating from an experimental or quasi-experimental basis
tend to believe that presentation of data to interest groups will ensure
utilization. Others lean more to the view that people are also influenced
by their previous experiences, thus viewing data subjectively (Cook & Campbell,
1976; Cronbach, 1982; Cronbach, et. al., 1980; Patton, 1978). Audience
understanding can be assisted by relating data to other knowledge areas, like
folklore, history, community experiences, and common sense (Cronbach, 1982;
Lakatos, 1970; Lindblom & Cohen, 1979; Weiss & Bucuvalas, 1977).

Evaluators can incidentally collect colorful and realistic material
that will make interim and final reports more vivid. Anecdotal information
can lend immeasurably to credibility and utilization potential of study conclusions.
This is especially so when information consumers are relatively ignorant of
the analyses usually performed upon data by academics (Cronbach, 1982; Cronbach,
et. al., 1980). Patton (1978, p. 234) claims this humanistic touch has
"received little more than lipservice in most evaluation research." 

House (1980) clearly separates analysis and interpretation. He says
analysis is the organization of data, construction of statistical tables, and
arrangement for presentation. Interpretation is the act of making judgments
about what the data mean. Scriven (1967) suggests the evaluator draw conclusions,
while Cronbach (1982) states that the evaluator should only present data,
not making any definitive policy suggestions. House (1980), though, says
both should be done, but with the boundaries of each clearly demarcated.
This way, audiences can view the analysis separately without the evaluator's
interpretations. Cronbach (1982) says that interpreting any type of data
conclusively is unrealistic because of site adaptation. This lends credence
to Guba and Lincoln's (1981, p. 381) statement that "if the evaluator and the
client interact in producing judgments and recommendations, that is, if the
judgments and recommendations are produced through a process of negotiation,
then each one can make a proper contribution from a posture of integrity."

How much data does an evaluator need before presenting a final report?
The better the description of a program, the sounder are the judgments based
upon that description. On the other hand, the longer a study takes, the more
chance there is of the situation changing, thus rendering results irrelevant
(Cronbach, 1982; Berryman & Glennan, 1980; Thompson & King, 1981). Developing
programs need direction as they unfold, not after the fact. Decisions need
to be made without full knowledge of their ramifications. The best evaluators
can do is to be aware of their limitations, relate data to general experience
and theory, and act on this basis when decision makers need assistance.

Program evaluation has been criticized for concentrating on the negative
(Freeman, 1977; House, 1980). This may be partially due to program evaluations
touching political and organizational nerves (Cronbach, et. al., 1980). Patton
(1978) and House (1980) believe that the real question is not whether information
is positive or negative but whether it is useful to decision makers.

Evaluation Utilization 

The literature contains a range of definitions of evaluation utilization. 
Weiss and Bucuvalas (1977) suggest that utilization has occurred if the evaluator 
gathers information that advances the decision making process. This can take 
the form of dramatic and immediate program changes (Alkin, Daillak, & White,
1979; Brown & Braskamp, 1980; Weiss, 1980). This can even result from the
efforts of inexperienced evaluators (Cichon, et. al., 1981). Weiss (1972)
claims that the rarity of immediate change has contributed to the notion 
that evaluations have little impact. 

However, effects can be more subtle. Knowledge can be gradually assimilated
into clients' understandings of important issues. Larger issues may be
raised, future studies may be suggested, additional questions may arise,
problems can be clarified, expectations can be kept realistic, and perceptions
altered (Alkin, et. al., 1979; Cichon, et. al., 1981; Patton, 1978; Weiss &
Bucuvalas, 1977; Young & Comtois, 1979).

The kinds of effects that an evaluation has will vary according to client
expectations, the quality of the information gathered by the evaluator, the
client-evaluator relationship, and degree of stakeholder involvement. It
may not be possible to see any external evidence of change because only client
perceptions may have been altered. Thus, future decisions are indeed influenced,
although the connection to the evaluation may not be obvious. Interestingly,
effects do not seem to depend upon the use of a formal evaluation model
(Alkin, et. al., 1979).

Evaluation data must compete with other information sources. Friends,
colleagues, past experiences, and biases influence the degree to which evaluation
data are seen as useful (Alkin, et. al., 1979; Guskin, 1980; Weiss, 1980).
For example, information users give more credibility to reports from male
evaluators than female evaluators, the use of jargon in reports adds to
evaluator credibility, and use of the word "researcher" rather than "evaluator"
or "content area specialist" creates an impression of objectivity. Also, the
closer the audience is to the decision-making role, the more critical it is
of the evaluator (Newman, Brown, & Braskamp, 1980).

Womack (1980) says evaluation reports do not compete successfully with
other information sources because they are often prepared for professional
and not client use. Weiss (1980, p. 231) says this constitutes "poor linkage
between researchers and decision makers."

Knorr (1980) says evaluation data can be used in four ways. First, it
can become a base for future decision making. Second, it can induce immediate
action. Third, decision makers can use it to merely create the impression
that something is being done. Fourth, it can be selectively used to legitimate
policy decisions already made. The first two uses fit the definitions of
utilization most often encountered in the literature, but the last two have
been described as obstacles to utilization (Cichon, et. al., 1981; Weiss, 1980;
Williams & Bank, 1981). Other obstacles to utilization include the inability
of decision makers to make their needs explicit; inadequate communication
channels between evaluators and their clients; decision maker unwillingness
to accept evidence that contradicts personal beliefs; the fluidity of community
influences; and a poor match between evaluator reports and audience sophistication
(Alkin, et. al., 1979; Weiss, 1972, 1980).

Evaluation utilization is not constrained by social science methodology 
or the changing nature of public issues. Therefore, there remains a relatively 
untapped potential for decision maker use of evaluation information (Weiss,
1980).

Evaluators have three roles from which they can choose: teacher, assistant,
and judge. The most critical of these is the teacher role if evaluation
utilization is to be maximized (Cichon, et. al., 1981; Guskin, 1980; Patrick,
McCann & Whitney, 1981; Weiss, 1972; Wise, 1980; Young & Comtois, 1979).
An evaluator in this role must be able to: 

- Initiate and maintain interest group cooperation, commitment and
involvement.

- Implement the concept of triangulation.

- Keep client expectations realistic and clear.

- Help clients articulate their needs.

- Report findings in an understandable manner.

- Present data at appropriate times.

- Work closely with client groups.

- Establish and maintain a positive working relationship with clients.

- Be flexible.

- Communicate well with interest groups.

- Facilitate client ownership of evaluation results.

- Recognize study limitations.

- Focus on client needs instead of the interests of professional colleagues.

- Take an active role in getting data utilized.

- Employ appropriate research methodology. (Berman, 1981; Cichon, et. al.,
1981; David, 1978; Gifford, 1974; Guskin, 1980; Patrick, et. al., 1981; Weiss,
1972, 1980; Williams & Bank, 1981; Wise, 1980; Young & Comtois, 1979).

The scope of these characteristics underlines the difficulty of fostering
utilization of evaluation findings. Utilization will not automatically happen; 
it requires "ingenuity, resourcefulness, and commitment" (Caplan, 1980, p. 9). 
Further, the nature of these characteristics has prompted Weiss (1980, p. 
245) to warn against "the inappropriate acceptance of the results promoted 
by the most persuasive or charismatic communicator." 

Ethics of Program Evaluation

Since leverage and credibility are inextricably linked (Cronbach, 1982),
several authors (Care, 1978; Cronbach, et. al., 1980; Guba & Lincoln, 1981;
Patton, 1978; Stake, 1977) have sought to delineate guidelines helpful in the
maintenance of evaluator credibility. These have been gathered into the
following list of professional considerations for program evaluators:

- Stakeholder groups' participation should be of their own volition.

- Stakeholders should be guaranteed their right to have input into the
evaluative process.

- Interest groups should be encouraged to honor their commitment to the
evaluation that they have expressed through their involvement or
renegotiate it through agreed upon channels.

- The agreement or contract should not favor one political entity over 
another, whether because of evaluator carelessness or the sponsor 
withholding politically significant information.

- Stakeholders require full and equal access to the accumulation of data
and to periodic reports from the evaluator.

- The best interests of the participants should be protected throughout 
the study. 

- Anonymity should be negotiated with subjects before gathering data.
When anonymity cannot be guaranteed, subjects should know in advance.
They should realize that in studies with small samples it may be possible
to identify informants through descriptions or quotations, even though
they were not explicitly identified. This can happen despite efforts
to combine elements from several cases into a representative case.
Evaluators should protect informants by maintaining coded file systems
so that individual identities cannot be ascertained by others.

- Evaluators should keep audience expectations realistic, given study 
constraints. 

- Study purposes should be explicitly stated.

- The evaluation must have social value.

There is a trend toward a synthesis of program evaluation models. This
focus incorporates concern for rigorous design, a combining of quantitative
and qualitative methodologies, respect for the perspectives of all participants,
pragmatism, active stakeholder involvement, and social value.

These features are incorporated into the following process model:

1. STUDY PROGRAM CONTEXT
2. ESTABLISH STAKEHOLDER COMMITMENT AND INVOLVEMENT
3. FOCUS THE EVALUATION
4. FORMULATE DESIGN
5. GATHER DATA
6. HOLISTIC AND STATISTICAL ANALYSES
7. INTERPRET RESULTS
8. ADAPT AND MODIFY PROGRAM

(The steps are arranged in a circle and joined by arrows, with step 8 leading
back to step 1.)

Program Evaluation Process Model

Evaluator-participant interaction is inherent in each stage. The arrows
indicate that progress is not automatic. It may be necessary to retrace
one or more of the steps before an evaluation can continue. The circular
arrangement depicts the continuous nature of evaluation.

Agar, M.H. (1980). The professional stranger. New York: Academic Press.

Alkin, M.C., Daillak, R., & White, P. (1979). Using evaluations. Beverly
Hills: Sage.

Bennett, C.A., & Lumsdaine, A.A. (1975). Social program evaluation: Definitions
and issues. In C.A. Bennett & A.A. Lumsdaine (Eds.), Evaluation and experiment:
Some critical issues in assessing social programs. New York: Academic
Press.

Berman, P. (1981). Educational change: An implementation paradigm. In
R. Lehming & M. Kane (Eds.), Improving schools: Using what we know.
Beverly Hills: Sage.

Berman, P., & McLaughlin, M.L. (1980). Federal programs supporting educational
change (Vol. 2). Santa Monica: Rand Corporation.

Berryman, S.E., & Glennan, T.K., Jr. (1980). An improved strategy of
evaluating federal programs in education. In J. Pincus (Ed.), Educational
evaluation in the public policy setting. Santa Monica: Rand Corporation.

Borg, W.R., & Gall, M.D. (1979). Educational research. New York: Longman.

Bowers, C.A. (1974). Cultural literacy for freedom. Eugene, OR: Elan.

Brown, R.D., & Braskamp, L.A. (1980). Summary: Common themes and a
checklist. In L.A. Braskamp & R.D. Brown (Eds.), New directions for program
evaluation. San Francisco: Jossey-Bass.

Campbell, D.T., & Fiske, D.W. (1959). Convergent and discriminant validation
by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81-105.

Caplan, N. (1980). What do we know about knowledge utilization? In L.A.
Braskamp & R.D. Brown (Eds.), New directions for program evaluation.
San Francisco: Jossey-Bass.

Care, N.S. (1978). Participation and policy. Ethics, 88(4), 316-337.

Cichon, D.J., Callahan, C., & Singh, B. (1981, April). Impact of a process
evaluation on an urban school system's policies and practices. Paper
presented at the Annual Meeting of the American Educational Research
Association (65th), Los Angeles, CA.

Cook, T.D. (1981). Dilemmas in evaluation of social programs. In M.B.
Brewer & B.E. Collins (Eds.), Scientific inquiry and the social sciences.
San Francisco: Jossey-Bass.

Cook, T.D., & Campbell, D.T. (1976). The design and conduct of quasi-
experiments and true experiments in field settings. In M.D. Dunnette
(Ed.), Handbook of industrial and organizational psychology. Chicago:
Rand McNally.

Cronbach, L.J. (1982). Designing evaluations of educational and social
programs. San Francisco: Jossey-Bass.

Cronbach, L.J., Robinson Ambron, S., Dornbusch, S.M., Hess, R.D., Hornik,
R.C., Phillips, D.C., Walker, D.F., & Weiner, S.S. (1980). Toward
reform of program evaluation. San Francisco: Jossey-Bass.

David, J.L. (1978). Local use of Title I evaluations. Washington, D.C.:
Office of Education (DHEW), Office of Planning, Budgeting, and Evaluation.
(ERIC Document Reproduction Service No. ED 187 727).

Denzin, N.K. (1978). The research act. New York: McGraw-Hill.

Freeman, H.E. (1977). The present status of evaluation research. In M.
Guttentag (Ed.), Evaluation studies (Vol. 2). Beverly Hills: Sage.

Geertz, C. (1973). The interpretation of cultures. New York: Basic Books.

Gifford, B.R. (1974, December 3). Restructuring the collection, processing
and dissemination of educational data: An action plan for change.
Brooklyn, NY: New York City Board of Education. (ERIC Document Reproduction
Service No. ED 136 829).

Guba, E.G., & Lincoln, Y.S. (1981). Effective evaluation. San Francisco:
Jossey-Bass.

Guskin, A.E. (1980). Knowledge utilization and power in university decision
making. In L.A. Braskamp & R.D. Brown (Eds.), New directions for program
evaluation. San Francisco: Jossey-Bass.

House, E.R. (1973). Epilogue: Can public schools be evaluated? In E.R.
House (Ed.), School evaluation: The politics and process. Berkeley,
CA: McCutchan.

House, E.R. (1980). Evaluating with validity. Beverly Hills: Sage.

Knorr, K.D. (1977). Policy makers' use of social science knowledge: Symbolic
or instrumental? In C.H. Weiss (Ed.), Using social research in public
policy making. Lexington, MA: Lexington Books.

Kuhn, T. (1970). The structure of scientific revolutions. Chicago: University
of Chicago Press.

Lakatos, I. (1970). Falsification and the methodology of scientific research
programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth
of knowledge. London: Cambridge University Press.

Lindblom, C.E., & Cohen, D.K. (1979). Usable knowledge. New Haven: Yale
University Press.

Marris, P., & Rein, M. (1973). Dilemmas of social reform. Chicago: Aldine.

Hills: Sage.

Newman, D.L., Brown, R.D., & Braskamp, L.A. (1980). Communication theory
and the utilization of evaluation. In L.A. Braskamp & R.D. Brown (Eds.),
New directions for program evaluation. San Francisco: Jossey-Bass.

Parlett, M., & Hamilton, D. (1975). Evaluation as illumination: A new
approach to the study of innovatory programs. In G.V. Glass (Ed.),
Evaluation studies: Annual review (Vol. 1). Beverly Hills: Sage.

Patrick, E., McCann, R., & Whitney, D. (1981). The dissemination linking
process: A view from the regional exchange. Washington, DC: National
Institute of Education. (ERIC Document Reproduction Service No. ED
206 662).

Patton, M.Q. (1980). Qualitative evaluation methods. Beverly Hills: Sage.

Patton, M.Q. (1978). Utilization-focused evaluation. Beverly Hills: Sage.

Pelto, P.J., & Pelto, G.H. (1978). Anthropological research. Cambridge:
Cambridge University Press.

Powdermaker, H. (1966). Stranger and friend. New York: W.W. Norton.

Sarason, S.B. (1982). The culture of the school and the problem of change
(2nd ed.). Boston: Allyn and Bacon.

Schultze, C.L. (1968). The politics and economics of public spending.
Washington, DC: The Brookings Institution.

Scriven, M. (1967). The methodology of evaluation. In R.W. Tyler (Ed.),
Perspectives of curriculum evaluation. Chicago: Rand McNally.

Stake, R.E. (1973). Evaluation design, instrumentation, data collection,
and analysis of data. In B.R. Worthen & J.R. Sanders (Eds.), Educational
evaluation: Theory and practice. Worthington, OH: Charles A. Jones.

Stake, R.E. (1977). Responsive evaluation. In R.W. Tyler (Ed.), Beyond
the numbers game. Berkeley: McCutchan.

Stake, R.E. (1967). Toward a technology for the evaluation of educational
programs. In R.W. Tyler (Ed.), Perspectives of curriculum evaluation.
Chicago: Rand McNally.

Thompson, B., & King, J.A. (1981, April). Evaluation utilization: A
literature review and research agenda. Washington, DC: National Institute
of Education. (ERIC Document Reproduction Service No. ED 199 271).

Tyler, R.W. (1967). Changing concepts of educational evaluation. In R.W.
Tyler (Ed.), Perspectives of curriculum evaluation. Chicago: Rand McNally.

Weiss, C.H. (1972). Evaluation research. Englewood Cliffs: Prentice-Hall.

Weiss, C.H. (1980). Social science research and decision-making. New York:
Columbia University Press. 

Weiss, C.H., & Bucuvalas, M.J. (1977). The challenge of social research
to decision making. In C.H. Weiss (Ed.), Using social research in public
policy making. Lexington, MA: Lexington Books.

Williams, R.C., & Bank, A. (1981). Use of data to improve instruction in
local school districts: Problems and possibilities. In C.B. Aslanian
(Ed.), Improving educational evaluation methods: Impact on policy.
Beverly Hills: Sage.

Wise, A.E. (1979). Legislated learning: The bureaucratization of the
American classroom. Berkeley: University of California Press.

Wise, R.I. (1980). The evaluator as educator. In L.A. Braskamp & R.D. Brown
(Eds.), New directions for program evaluation. San Francisco: Jossey-Bass.

Womack, T.A. (1967). Educational evaluation: Administrative function.
In W.H. Strevell (Ed.), Rational of education evaluation. Washington,
D.C.: Office of Education, Bureau of Elementary and Secondary Education
(DHEW). (ERIC Document Reproduction Service No. ED 034 292).

Young, C.J., & Comtois, J. (1979). Increasing congressional utilization of
evaluation. In F.M. Zweig (Ed.), Evaluation in legislation. Beverly
Hills: Sage.


